Methods and systems for ransomware protection

ABSTRACT

Described are methods and systems for detecting a ransomware process acting on files organized in folders of a file system. A subfolder path is identified for each file acted upon by a suspect process and the paths are combined into a number of unique paths for the suspect process. If the number of unique paths exceeds a threshold, the suspect process is categorized as malware. The methods and systems also include a hook and a filter to prevent shadow-copy deletions that would otherwise interfere with file recovery.

TECHNICAL FIELD

Embodiments of the present disclosure relate generally to data processing and, more particularly, but not by way of limitation, to methods and systems for protecting computer systems from malware (a portmanteau for malicious computer software).

BACKGROUND

“Ransomware” is a type of malware from cryptovirology that attacks a victim's computer by encrypting their files or otherwise denying file access until a ransom is paid. Ransomware attackers may also threaten to publish the victim's data unless a ransom is paid. In the computer context, a file is a virtual component stored in memory to represent e.g. an image, text, video, or a computer program. A ransomware process, an instance of a ransomware program executing on a computer, can overwrite and delete files using calls to the computer's operating system.

Anti-virus (AV) and Next-generation Anti-virus (Nextgen AV) software detect malware using techniques like signature-based detection, traffic-based detection, and behavioral detection. Signature-based detection requires malware with a known signature and is thus ineffective against recent strains or targeted attacks. Traffic-based detection looks for communication patterns common to malware to detect recent strains but is relatively slow and inefficient. Behavior-based detection evaluates a process' actions or intended actions for suspicious behavior. Though promising, behavioral detection techniques suffer from high false-positive rates and concomitant productivity loss. There is therefore a need for improved malware detection and mitigation.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 depicts a malware detection system 100 in accordance with one embodiment.

FIG. 2 depicts detection engine 130 in accordance with an embodiment that includes a behavioral detection unit 200, a file-traversal-information detection unit 205, a true-root-path detection unit 210, and a neural-network-based classifier 215.

FIG. 3 is a flowchart 300 of a method of finding the true-root folder(s) for a process in accordance with one embodiment of detection unit 210.

FIG. 4 is a flowchart 400 depicting a method of detecting ransomware in accordance with one embodiment of system 100 of FIG. 1.

FIG. 5 depicts a ransomware-protection system 500 instantiated on a computer, a user endpoint that executes processes using one or more processors and memory to modify files and store them in secondary storage 505.

DETAILED DESCRIPTION

FIG. 1 depicts a malware detection system 100 in accordance with one embodiment. System 100 includes a user endpoint 105 (e.g. a desktop computer, laptop, or mobile device) connected to a server 110 via a network 115 (e.g. the Internet). Endpoint 105 detects ransomware by analyzing process events (notifications related to a process start and end) and file events (notifications related to file operations, such as Read, Write, Delete, Rename, Directory create, and Directory traversed). In this context, a “process” is a single unit of execution spawned from an executable file, or “binary,” and an “event” is a single unit of notification corresponding to changes made by a process.

User endpoint 105 includes a raw-data queue 120 that relays incoming events to an event reader 125 for interpretation. A detection engine 130 scans an event sequence corresponding to a given file to check if it is a ransomware threat. A high-priority queue 135 acts as a buffer to store the events deemed threatening before passing them on to a response engine 140, which performs any one of three responses as configured by the system administrator. The three responses are 1) Detect, 2) Detect and kill, and 3) Detect, kill, and back-up recovery. Response engine 140 notifies server 110 about events posing ransomware threats and about the suitable configured response performed.

Events from event reader 125 also flow through a filtering unit 145, which extracts necessary events and moves them to a low-priority queue 150 on their way to a storage unit 155 for future reference in case of any investigation. Examples of necessary events are File_Write, File_Rename, and File_Delete, which are necessary to note because they impact stored files. Unnecessary events, such as File_Read and Dir_Traversed, are filtered out to save space because they are not needed to recall changes. Server 110 may also insist on storing all the events posing ransomware threats to endpoint 105 for future investigation.
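The filtering step can be illustrated with a minimal Python sketch. It assumes each event is a dictionary with a `type` field; that layout, the function name `filter_events`, and the sample paths are hypothetical and are not part of the described embodiment. Only the event names come from the description above.

```python
from collections import deque

# Event types that change stored files, per the description above.
NECESSARY_EVENTS = {"File_Write", "File_Rename", "File_Delete"}


def filter_events(events):
    """Keep only the events needed to recall changes to stored files."""
    return [event for event in events if event["type"] in NECESSARY_EVENTS]


# Hypothetical sample events; the dictionary layout is an assumption.
sample_events = [
    {"type": "File_Read", "path": r"C:\Users\test\a.txt"},
    {"type": "File_Write", "path": r"C:\Users\test\a.txt"},
    {"type": "Dir_Traversed", "path": r"C:\Users\test"},
    {"type": "File_Delete", "path": r"C:\Users\test\b.txt"},
]

# Stand-in for low-priority queue 150.
low_priority_queue = deque(filter_events(sample_events))
kept_types = [event["type"] for event in low_priority_queue]
print(kept_types)
```

In this sketch the read and traversal events are dropped before queuing, so only the write and delete events would reach storage.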

FIG. 2 depicts detection engine 130 in accordance with an embodiment that includes a behavioral detection unit 200, a file-traversal-information detection unit 205, a true-root-path detection unit 210, and a neural-network-based classifier 215.

Behavioral detection unit 200 identifies behavioral patterns in event sequences for each process under consideration. These behavioral patterns include:

-   ORIGINAL_FILE_OVERWRITE: The process continuously navigates multiple folders and overwrites files with encrypted content.
-   ORIGINAL_FILE_DELETED_AND_NEW_ENCRYPTED_FILE_CREATED: The process continuously navigates multiple folders, deleting files and creating new files with encrypted content.
-   ORIGINAL_FILE_OVERWRITE_AND_RENAMED: The process continuously navigates multiple folders, overwrites files with encrypted content, and renames the overwritten files.
-   ORIGINAL_FILE_RENAMED_AND_OVERWRITE: The process continuously navigates multiple folders, renames the files, and overwrites the renamed files with encrypted content.

The behavioral pattern can also be one of the following based on the process and corresponding binary file:

-   Single binary and Single process: A single process discovers files and encrypts them.
-   Single binary and Multiple process: Multiple processes spawned from a single binary split the work among themselves, each encrypting only a limited set of files, thereby evading frequency-based detections.
-   Multiple binary and Multiple process: A separate binary is dedicated to each operation, which makes this pattern highly evasive.

Behavioral detection unit 200 passes files whose event sequences have been determined to be a ransomware threat to file-traversal-information detection unit 205, which determines the information corresponding to the file traversal pattern using Application Program Interface (API) based traversal and/or New Technology File System (NTFS) based traversal. Detection unit 205 supports an algorithm that detects path-traversal attacks, which exploit security flaws associated with user-supplied file names. Algorithms for preventing path-traversal attacks are well known so a detailed discussion is omitted.

The file traversal pattern alone may be insufficient to determine whether a file or files is affected by ransomware because a desired process may traverse files in a similar manner. If the file traversal pattern is suspect, then the process moves to true-root-path detection unit 210 for further consideration.

Files are typically organized in a file system, a data structure that the operating system uses to control how files are stored in memory. The file system separates data into named pieces, giving each piece a file name and location in memory. File systems typically allow files to be grouped in folders, also called directories, that can themselves include subfolders (subdirectories) with their own files and subfolders. The folder/subfolder relationship is analogous to parent/child, with the uppermost folder called the root.

The location of a file within a file system is specified as a string of characters called a “folder path” that specifies the folders and subfolders that contain the file. For example, the folder path for a text file Notes.txt stored on a disk drive designated with a drive letter “C” might be expressed as C:\Users\Test\Documents\Notes.txt. In this example, the root folder C:\ contains subfolder Users, which contains subfolder Test, which in turn contains subfolder Documents. Folder and file names are delineated using colon and backslash characters in this embodiment, but other characters and formats can be used.
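As an illustration only, the following Python sketch splits such a path into its drive, folders, and file name using the standard-library `pathlib.PureWindowsPath` class, which parses Windows-style paths on any platform. The helper name `folder_chain` is hypothetical.

```python
from pathlib import PureWindowsPath


def folder_chain(file_path):
    """Return the containing folders, from the direct parent up to the root."""
    return list(PureWindowsPath(file_path).parents)


notes = PureWindowsPath(r"C:\Users\Test\Documents\Notes.txt")
print(notes.drive)   # drive designation
print(notes.parts)   # root, folders, and file name as separate components
print(notes.name)    # file name only

for folder in folder_chain(notes):
    print(folder)
```

The list returned by `folder_chain` starts at the folder that directly contains the file and ends at the root, which is the order in which a parent/child walk of the hierarchy would visit the folders.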

Ransomware tends to modify many disparate files across a file system, among myriad directories and subdirectories, and this pattern of action can aid in ransomware detection. However, many genuine processes also modify disparate files across large numbers of directories and subdirectories, “genuine” here referring to processes that are intended by the user or consistent with the user's goals. True-root-path detection unit 210 distinguishes between the file-modification patterns of ransomware and genuine processes to improve ransomware detection and reduce false positives.

Whenever a file is modified by a process, true-root-path detection unit 210 looks up the path of the folder containing the file and compares this path with those of other files modified by the same process. The popular CHROME web browser developed by Google LLC, which runs as the executable chrome.exe, exemplifies a genuine process that can modify files within many folders. The following Table 1 lists twenty-nine paths associated with an instance of process chrome.exe, each path identifying a folder that can include one or more files that can be acted upon by the process.

-   1 C:\Users\test\AppData\Local\Google\Chrome\UserData\AutofillStates
-   2 C:\Users\test\AppData\Local\Google\Chrome\UserData\BrowserMetrics
-   3 C:\Users\test\AppData\Local\Google\Chrome\UserData\CertificateRevocation
-   4 C:\Users\test\AppData\Local\Google\Chrome\UserData\ClientSidePhishing
-   5 C:\Users\test\AppData\Local\Google\Chrome\UserData\Crashpad
-   6 C:\Users\test\AppData\Local\Google\Chrome\UserData\CrowdDeny
-   7 C:\Users\test\AppData\Local\Google\Chrome\UserData\Default
-   8 C:\Users\test\AppData\Local\Google\Chrome\UserData\DesktopSharingHub
-   9 C:\Users\test\AppData\Local\Google\Chrome\UserData\FileTypePolicies
-   10 C:\Users\test\AppData\Local\Google\Chrome\UserData\Floc
-   11 C:\Users\test\AppData\Local\Google\Chrome\UserData\GrShaderCache
-   12 C:\Users\test\AppData\Local\Google\Chrome\UserData\GuestProfile
-   13 C:\Users\test\AppData\Local\Google\Chrome\UserData\hyphen-data
-   14 C:\Users\test\AppData\Local\Google\Chrome\UserData\MEIPreload
-   15 C:\Users\test\AppData\Local\Google\Chrome\UserData\Notification Resources
-   16 C:\Users\test\AppData\Local\Google\Chrome\UserData\OnDeviceHeadSuggestModel
-   17 C:\Users\test\AppData\Local\Google\Chrome\UserData\OptGuidePredictionModels
-   18 C:\Users\test\AppData\Local\Google\Chrome\UserData\OptHints
-   19 C:\Users\test\AppData\Local\Google\Chrome\UserData\OriginTrials
-   20 C:\Users\test\AppData\Local\Google\Chrome\UserData\pnacl
-   21 C:\Users\test\AppData\Local\Google\Chrome\UserData\profile1
-   22 C:\Users\test\AppData\Local\Google\Chrome\UserData\profile2
-   23 C:\Users\test\AppData\Local\Google\Chrome\UserData\Recovery Improved
-   24 C:\Users\test\AppData\Local\Google\Chrome\UserData\Safe Browsing
-   25 C:\Users\test\AppData\Local\Google\Chrome\UserData\SafetyTips
-   26 C:\Users\test\AppData\Local\Google\Chrome\UserData\ShaderCache
-   27 C:\Users\test\AppData\Local\Google\Chrome\UserData\SSLErrorAssistant
-   28 C:\Users\test\AppData\Local\Google\Chrome\UserData\SubresourceFilter
-   29 C:\Users\test\AppData\Local\Google\Chrome\UserData\SwReporter

Table 1: List of multiple folders accessed by Chrome.exe

True-root-path detection unit 210 reviews the file paths associated with a given process to find one or more “true root folders” for the process. True root folders contain all the files a process acts upon and tend to be at lower levels in the file system hierarchy. In the example of Table 1, the folders “Google”, “Chrome”, and “UserData” are all specified by relatively low-level subfolder paths and contain all the files chrome.exe is acting upon. True-root-path detection unit 210 narrows these down to identify the path to folder “Chrome” (C:\Users\test\AppData\Local\Google\Chrome) as the true root folder using an algorithm that distinguishes folder levels based on folder creation times. Genuine processes may have more than one true root folder, but the number will likely be low.

Ransomware processes tend to attack files across file hierarchies. Examining files acted upon by a ransomware thus yields a relatively large number of true root folders. True-root-path detection unit 210 counts the number of true root folders for each process under consideration and compares the number to a folder threshold. Processes that produce a number of true root folders above the threshold are flagged as ransomware or potential ransomware. In one embodiment, the true root folder threshold is three.

The following is an exemplary algorithm true-root-path detection unit 210 deploys to find the true root folder(s) for a given process. Other embodiments can work differently.

Algorithm:

    find(true_root_of_a_file (file_or_folder_to_check)) {
        Get file_or_folder_to_check
        Get parent_folder(file_or_folder_to_check)
        Get parent_folder(parent_folder(file_or_folder_to_check))
        Get folder_creation_time(parent_folder(file_or_folder_to_check))
        Get folder_creation_time(parent_folder(parent_folder(file_or_folder_to_check)))
        Diff ← folder_creation_time(parent_folder(file_or_folder_to_check)) - folder_creation_time(parent_folder(parent_folder(file_or_folder_to_check)))
        if Diff >= Level_threshold then {
            true_root_of_a_file(file_or_folder_to_check) ← parent_folder(file_or_folder_to_check)
            Return true_root_of_a_file(file_or_folder_to_check)
        } else {
            file_or_folder_to_check ← parent_folder(file_or_folder_to_check)
            Repeat find true_root_of_a_file (file_or_folder_to_check)
        }
    }

Sample Input Path:

-   C:\Users\test\AppData\Local\Google\Chrome\User\Data\RecoveryImproved\cache.txt

Sample Output Path (True Root Folder):

-   C:\Users\test\AppData\Local\Google\Chrome

True-root-path detection unit 210 finds the true root folder(s) for every file accessed by a process. The entire directory path is fed as input. The foregoing algorithm considers each folder's creation time, which is noted by the operating system at time of creation. Detection unit 210 uses API calls to fetch the creation times of folders at adjacent levels of the directory path, e.g. levels N and N-1, and compares the times. If the time difference exceeds a level threshold, or time threshold, then folder level N is considered a stopping point. The path to the folder at level N is returned as the true root folder. If the time difference does not exceed the level threshold, detection unit 210 continues along the directory path until the true root folder is found. Other embodiments use different path characteristics to detect ransomware.
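The creation-time walk can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of detection unit 210: the creation-time lookup is passed in as a function so the sketch runs without touching a real file system, the creation times and the one-hour level threshold are invented for the example, and stopping at the drive root is an added assumption because the exemplary algorithm does not say what happens there.

```python
from pathlib import PureWindowsPath


def find_true_root(path, creation_time, level_threshold):
    """Walk up from `path` until the parent folder was created at least
    `level_threshold` seconds after its own parent, then return that parent."""
    item = PureWindowsPath(path)
    while True:
        parent = item.parent
        grandparent = parent.parent
        if grandparent == parent:
            # Assumption: stop at the drive root if no gap was found.
            return parent
        diff = creation_time(parent) - creation_time(grandparent)
        if diff >= level_threshold:
            return parent
        item = parent


# Hypothetical folder creation times, in seconds since an arbitrary origin.
GOOGLE = PureWindowsPath(r"C:\Users\test\AppData\Local\Google")
CHROME = GOOGLE / "Chrome"
USER_DATA = CHROME / "UserData"
CREATED = {
    GOOGLE: 1_015,
    CHROME: 90_000,        # created long after its parent folder
    USER_DATA: 90_005,
    USER_DATA / "RecoveryImproved": 90_010,
}

true_root = find_true_root(
    USER_DATA / "RecoveryImproved" / "cache.txt",
    CREATED.__getitem__,
    level_threshold=3_600,  # one hour, an assumed value
)
print(true_root)
```

With these invented times, the subfolders under Chrome were created within seconds of one another, so the walk passes through them and stops at the Chrome folder, whose creation time differs from that of its parent by more than the threshold.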

FIG. 3 is a flowchart 300 of a method of finding the true root folder(s) for a single file in accordance with one embodiment of detection unit 210. The operating system supports system utilities that detection unit 210 calls upon to monitor real-time process/thread activity. The WINDOWS 10 operating system (Windows OS) from Microsoft Corporation, for example, provides kernel-level callbacks that can be used to monitor process creation and file changes to inform system 100 of an executing process. The operating system is then called upon to provide a list of files being acted upon by the process under consideration. Again using the example of WINDOWS 10, a process explorer or data collector (kernel driver) can provide process and file events to be used in creating the requisite list of files. The procedure of flowchart 300 starts and is repeated for each file of a process under consideration, providing a subfolder path for the file. Subfolder paths common to multiple files are consolidated into true-root paths, or folders.

In step 305, detection unit 210 gets the path for a file F. Next, in steps 310 and 315, detection unit 210 gets the file's parent folders F1 and F2, the latter being the direct parent of the former. The folder creation times for folders F1 and F2 are read (steps 320 and 325) and the time difference calculated (step 330). Per decision 335, if the time difference falls below a level threshold, the folder level is incremented (step 340) and the loop repeats for the next level in the folder hierarchy of the file's path. If the time difference exceeds the threshold, detection unit 210 outputs the path associated with the parent folder as the subfolder path that is the true root folder for the file (step 345). The method is then complete for that file. The method is repeated for each file associated with the process under consideration.

Identical subfolder paths for different files from flowchart 300 are consolidated into one true-root path. In the example of Table 1, all files acted upon by chrome.exe share the same true-root path C:\Users\test\AppData\Local\Google\Chrome. The true-root-path count for process chrome.exe would thus be one. True-root-path detection unit 210 compares the true-root-path count with a root-path threshold. If the count is greater than the root-path threshold, then files associated with the process under consideration are suspected to be, or to be infected by, malware. Chrome.exe, having a count of just one, would not raise suspicion.
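The consolidation and comparison can be sketched in a few lines of Python. The function names are hypothetical, the default threshold of three is the value given above for one embodiment, and the paths of the second process are invented for illustration.

```python
from pathlib import PureWindowsPath


def count_true_roots(true_roots):
    """Consolidate identical true-root paths and count the unique ones.
    PureWindowsPath compares case-insensitively, as Windows paths do."""
    return len({PureWindowsPath(root) for root in true_roots})


def is_suspect(true_roots, root_path_threshold=3):
    """Flag a process whose unique true-root count exceeds the threshold."""
    return count_true_roots(true_roots) > root_path_threshold


# One true root per file acted upon; all of chrome.exe's files share one.
chrome_roots = [r"C:\Users\test\AppData\Local\Google\Chrome"] * 29

# Hypothetical process touching files under many unrelated folders.
scattered_roots = [
    r"C:\Users\test\Documents",
    r"C:\Users\test\Pictures",
    r"C:\Users\test\Desktop",
    r"C:\Users\test\Music",
    r"C:\Projects",
]

print(count_true_roots(chrome_roots), is_suspect(chrome_roots))
print(count_true_roots(scattered_roots), is_suspect(scattered_roots))
```

Using a set for consolidation keeps the count independent of how many files share a given true-root path, so a process that modifies thousands of files under a single folder still yields a count of one.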

Files of suspect processes can be subjected to additional inspection to avoid false positives. Recalling that ransomware threatens to encrypt the affected files, detection engine 130 can feed files associated with a suspect process to neural-network-based classifier 215, a linear neural-network model with five layers of neurons in this embodiment. Classifier 215 applies a machine-learning algorithm to classify each file as either encrypted or not encrypted. Encrypted files further suggest a ransomware threat. The sequence of events corresponding to the suspect file (such as “File_Delete”, “File_Modified”, “File_Rename”) is put into high-priority queue 135 for further consideration.
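The internals of classifier 215 are not detailed here. Purely to illustrate what an encrypted-or-not decision on file contents can look like, the following Python sketch substitutes a much simpler and different technique, a Shannon-entropy heuristic over byte frequencies. It is not the neural-network classifier of this embodiment; the 7.5 bits-per-byte cutoff is an assumed value, and compressed files would also score high under this heuristic.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, cutoff: float = 7.5) -> bool:
    """Heuristic stand-in: near-uniform byte values suggest encryption.
    The cutoff is an assumption chosen for this illustration."""
    return shannon_entropy(data) >= cutoff


plain_text = b"The quick brown fox jumps over the lazy dog. " * 200
random_like = os.urandom(9000)  # stands in for encrypted content

print(looks_encrypted(plain_text), looks_encrypted(random_like))
```

Ordinary text uses a small part of the byte range and scores well below the cutoff, while random-looking content scores close to eight bits per byte.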

Returning to FIG. 1, events in high-priority queue 135 are passed to response engine 140, which is configured by the system administrator to perform one of three operations: 1) Detect, 2) Detect and kill, and 3) Detect, kill, and back-up recovery. Response engine 140 notifies server 110 about events posing ransomware threats and about the suitable configured response performed. Server 110 may store the events posing ransomware threats for future investigation.
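The three administrator-configured responses can be modeled as an enumeration, as in the following Python sketch. The action names returned by `planned_actions` are hypothetical labels for this illustration; the description above specifies only the three modes and the notification to server 110.

```python
from enum import Enum


class ResponseMode(Enum):
    DETECT = 1
    DETECT_AND_KILL = 2
    DETECT_KILL_AND_RECOVER = 3


def planned_actions(mode):
    """List the actions a response engine might take for a configured mode.
    Action names are illustrative placeholders, not a real API."""
    actions = ["notify_server"]  # the server is notified in every mode
    if mode in (ResponseMode.DETECT_AND_KILL, ResponseMode.DETECT_KILL_AND_RECOVER):
        actions.append("kill_process")
    if mode is ResponseMode.DETECT_KILL_AND_RECOVER:
        actions.append("recover_from_backup")
    return actions


for mode in ResponseMode:
    print(mode.name, planned_actions(mode))
```

Each mode extends the previous one, so the sketch builds the action list cumulatively instead of listing every combination separately.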

A simple pattern for a ransomware process is to read and modify a file or files. This pattern is also exhibited by genuine processes. In the foregoing example, the process chrome.exe reads and modifies cookie and cache files for webpages visited by a user. Such actions can make a genuine application appear to be a ransomware threat, thereby increasing the false-positive rate. The increased false-positive rate poses a major challenge. Some embodiments deploy filters to further reduce false positives. These filters include:

-   1. Affected file discovered via Directory traversal
    -   1.1. API based directory traversal
    -   1.2. NTFS system-based traversal
-   2. Finding total number of unique process paths (e.g. unique “true root folders”) for each process.
-   3. Applying secondary analysis to detect if a file in victim's endpoint has been encrypted, to conclude if a ransomware threat has been posed. This is performed using the neural network-based classifier.

FIG. 4 is a flowchart 400 depicting a method of detecting ransomware in accordance with one embodiment of system 100 of FIG. 1. Detection engine 130 receives event sequences from event reader 125 (step 405). Per decision 410, events and event sequences that are not associated with ransomware are ignored, while those consistent with ransomware behavior are passed to decision 415 where they are analyzed for signs of a path-traversal attack. If there are no such signs, the event or event sequence is ignored. If there are signs of a path-traversal attack, the process moves to decision 420 for further consideration.

Per decision 420, detection engine 130 ignores processes that yield fewer true root folders than a folder threshold (e.g. three) and passes the suspicious files of the remaining processes to decision 425. A neural network is applied at decision 425 to detect whether a suspicious file is encrypted. If not, the event sequence is ignored. If so, then detection engine 130 signals ransomware detection (step 430) and performs the configured response to address the threat (step 435).
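The decision chain of flowchart 400 can be summarized in a short Python sketch. The `ProcessSummary` record and its field names are hypothetical; each field stands for the outcome of one decision block, which in the described system would be produced by the corresponding detection unit. The sketch assumes those outcomes have already been computed.

```python
from dataclasses import dataclass


@dataclass
class ProcessSummary:
    matches_ransomware_pattern: bool  # outcome of decision 410
    shows_traversal_signs: bool       # outcome of decision 415
    true_root_count: int              # input to decision 420
    has_encrypted_file: bool          # outcome of decision 425


def detect(summary, folder_threshold=3):
    """Return True only if every decision in the chain points to ransomware."""
    if not summary.matches_ransomware_pattern:
        return False
    if not summary.shows_traversal_signs:
        return False
    # Processes with fewer true root folders than the threshold are ignored.
    if summary.true_root_count < folder_threshold:
        return False
    return summary.has_encrypted_file


browser = ProcessSummary(True, True, 1, False)
suspect = ProcessSummary(True, True, 5, True)
print(detect(browser), detect(suspect))
```

Because each stage can only rule a process out, a genuine process such as a browser is dropped at the first check it fails, and the more expensive encrypted-file check runs only for processes that pass every earlier stage.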

Ransomware often attempts to delete backup files. Windows OS includes backup and recovery software called Volume Snapshot Service (VSS). System 100 periodically requests the VSS to save snapshots of files—shadow copies—including when the files are in use. The latest snapshots can be used to recover corrupted or lost files, helpful in the event of a ransomware attack. A major issue in relying on behavioral ransomware detection is that by the time a threat is detected, some of the user's files may have already been encrypted. One embodiment protects backup storage using Component Object Model (COM) Application Programming Interface (API) hooking and a Kernel Input/Output Request Packet (IRP) filtering method that prevent ransomware from deleting shadow copies so that encrypted files can be recovered.

FIG. 5 depicts a ransomware-protection system 500 instantiated on a computer, a user endpoint that executes processes using one or more processors and memory to modify files and store them in secondary storage 505 (e.g. on a solid-state or hard-disk drive). Processes relevant to this discussion are shown diagrammatically. A user layer 510 shows processes that execute at the user level and a kernel layer 515 shows processes that execute within the operating system. All user software communicates with hardware, including secondary storage 505, via kernel layer 515.

User layer 510 includes a VSS operation requestor 520 that represents a volume-snapshot service tasked with saving shadow copies by invoking a system call to operating-system kernel 515 to access secondary storage 505. A malware VSS operation requestor 520 can issue file-deletion requests to VSS 520 using e.g. command-line instructions via an administrative process vssadmin.exe 525, code or script via a utility process wmic.exe 530, or directly via the COM API 535.

Windows OS comes with a shadow-copy process vssvc.exe in place of VSS 520. Like VSS 520, process vssvc.exe includes a COM receiver 540 that can receive deletion requests and pass them to a shadow-copy-delete routine 545 that makes the call to kernel 515. VSS 520 is like shadow-copy process vssvc.exe but modified to include a hook 550 to intercept requests to delete shadow copies. COM receiver 540 implements a Windows interface called “COM API,” which refers to the Component Object Model Application Programming Interface. Hook 550 augments the COM API by intercepting function calls or messages.

There are many API hooking methods available. In Windows, for example, a Dynamic Link Library (DLL) can be injected into a VSS process from the kernel. During process creation, the injected DLL is loaded along with other DLLs. Since DLLs are injected from the kernel, it is difficult to control the DLL load order. When two DLLs are injected, for example, the first one can be injected into a VSS process, where it hooks the NtQuerySystemInformation API, which in turn loads the second DLL, which hooks the DeleteSnapshots COM API. The address of the DeleteSnapshots COM API is calculated using vssvc.pdb. To hook a non-exposed API, the address at which the API resides is calculated using the pdb file and that address is used for hooking.

Kernel 515 includes a filter driver 555 that prevents shadow-copy delete or modification requests that bypass VSS 520, e.g. using a direct drive access (IOCTL command) to a file-system driver 560, or via a shadow-storage resize. Filter driver 555 prevents deletion requests from reaching file system driver 560, a conventional component of the Windows OS, and thus foils efforts by ransomware to prevent recovery of encrypted or otherwise lost files.

Filter driver 555, in one embodiment, employs Kernel Input/Output Request Packet (IRP) filtering to protect shadow copies from deletion methods that bypass VSS 520, such as direct-device access and shadow-storage resize. Filter driver 555 can load above or below a device driver to capture IOCTL control codes that are sent to a storage volume (e.g. secondary storage 505). Filter driver 555 can block delete requests from VSS 520 that would otherwise delete shadow copies. Hook 550 is nevertheless included so that VSS 520 maintains synchronization with kernel 515. VSS 520 can, for example, maintain a record of shadow copies in secondary storage 505. Hook 550 can alert VSS 520 that a request is to be ignored, leaving filter driver 555 to block the request. Alternatively, VSS 520 can itself block the request.

Variations of these embodiments, and variations in usage of these embodiments, including separate or combined embodiments in which features are used separately or in any combination, will be obvious to those of ordinary skill in the art. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description. In U.S. applications, only those claims specifically reciting “means for” or “step for” should be construed in the manner required under 35 U.S.C. § 112(f). 

1. A method for detecting ransomware from a process acting on files organized in folders of a file system, each of the files specified by a file path, the method comprising: for each of the files acted upon by the process, identifying a subfolder path in the file path of each of the files; combining the subfolder paths into a number of unique process paths; and comparing the number of unique process paths to a folder threshold.
 2. The method of claim 1, further comprising issuing a ransomware-warning responsive to the comparing if the number of unique process paths exceeds the folder threshold.
 3. The method of claim 1, wherein each of the folders has a folder-creation time, and wherein identifying the subfolder path for each of the files comprises comparing the folder-creation times of the folders specified in the subfolder path.
 4. The method of claim 3, wherein identifying the subfolder path for each of the files further comprises comparing a difference between the folder-creation times with a time threshold.
 5. The method of claim 1, further comprising checking whether the files are encrypted.
 6. The method of claim 5, wherein the checking comprises applying a machine-learning algorithm to the files.
 7. The method of claim 5, further comprising issuing a ransomware-warning responsive to the comparing if the number of unique process paths exceeds the folder threshold and at least some of the files are encrypted.
 8. A computer system for detecting ransomware from a process acting on files organized in folders of a file system, each of the files specified by a file path, the system comprising: an event detection engine to detect events that threaten the files; and a response engine that responds to each of the detected events, the response engine, for each of the files threatened by one of the detected events: identifying a subfolder path in the file path of each of the files; combining the subfolder paths into a number of unique process paths; and comparing the number of unique process paths to a folder threshold.
 9. The computer system of claim 8, the response engine further issuing a ransomware-warning responsive to the comparing if the number of unique process paths exceeds the folder threshold.
 10. The computer system of claim 8, wherein each of the folders has a folder-creation time, and wherein identifying the subfolder path for each of the files comprises comparing the folder-creation times of the folders specified in the subfolder path.
 11. The computer system of claim 10, wherein identifying the subfolder path for each of the files further comprises comparing a difference between the folder-creation times with a time threshold.
 12. The computer system of claim 8, the response engine further checking whether the files are encrypted.
 13. The computer system of claim 12, wherein the checking comprises applying a machine-learning algorithm to the files.
 14. The computer system of claim 13, the response engine further issuing a ransomware-warning responsive to the comparing if the number of unique process paths exceeds the folder threshold and at least some of the files are encrypted.
 15-17. (canceled)
 18. A ransomware-protection system instantiated on a computer system including a processor and memory to execute a shadow-copy process that invokes a system call to an operating-system kernel to write a shadow copy of a file in the memory to secondary storage, the shadow-copy process including a hook to intercept a request to alter the shadow copy of the file in the secondary storage.
 19. The ransomware-protection system of claim 18, the operating-system kernel including a filter to intercept a second request to alter the shadow copy of the file in the secondary storage.
 20. The ransomware-protection system of claim 19, wherein the second request comprises a request packet.
 21. The ransomware-protection system of claim 18, wherein the request to alter the shadow copy comprises at least one of a delete request and an overwrite request.
 22. The ransomware-protection system of claim 18, wherein the hook declines the request to alter the shadow copy of the file.
 23. The ransomware-protection system of claim 22, wherein the hook informs the shadow-copy process that the request to alter the shadow copy of the file was declined.
 24. The ransomware-protection system of claim 18, wherein the hook informs the shadow-copy process that the request to alter the shadow copy of the file will be declined.
 25-29. (canceled)